Introduction
This project explores customer investment behavior across various asset classes using a dataset of 12,000 transactions over multiple years. The goal is to uncover trends in asset preferences, track average returns over time, and segment users by investment size and frequency.
The dataset
This dataset contains the following features:
- Customer ID: The customer's ID
- Gender: Male, Female, or Other
- Age: The customer's age
- Region: The customer's state
- Date: The date the investment was made
- Investment Type: Stocks, Fixed Income, Crypto, Mutual Fund, or Real Estate
- Amount Invested: The capital the customer invested
- ROI: The customer's return on investment
- Customer Join Date: The date the customer joined
- Risk Profile: The customer's risk profile (Low, Medium, or High)
- Last Investment Date: The date of the customer's most recent investment
| | CustomerID | Gender | Age | Region | Date | InvestmentType | Amount | ROI | CustomerJoinDate | RiskProfile | LastInvestmentDate |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INV00088 | Male | 20 | Port Harcourt | 11/2/23 | Mutual Fund | 14174.79 | 668.30 | 1/25/22 | Medium | 9/27/24 |
| 1 | INV00810 | Male | 39 | Kano | 10/9/24 | Fixed Income | 28842.26 | 641.62 | 11/18/23 | Medium | 12/26/24 |
| 2 | INV00684 | Female | 41 | Abuja | 7/7/23 | Mutual Fund | 4411.04 | 216.18 | 5/14/22 | Medium | 5/9/24 |
| 3 | INV01683 | Other | 23 | Port Harcourt | 6/24/23 | Crypto | 18967.37 | 4290.20 | 2/11/23 | Medium | 6/8/24 |
| 4 | INV02582 | Female | 67 | Abuja | 10/14/24 | Fixed Income | 1440.70 | 45.99 | 10/4/21 | Medium | 10/14/24 |
Data Cleaning & Preparation
====== Data Information ======
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12000 entries, 0 to 11999
Data columns (total 11 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CustomerID 12000 non-null object
1 Gender 12000 non-null object
2 Age 12000 non-null int64
3 Region 12000 non-null object
4 Date 12000 non-null object
5 InvestmentType 12000 non-null object
6 Amount 12000 non-null float64
7 ROI 12000 non-null float64
8 CustomerJoinDate 12000 non-null object
9 RiskProfile 12000 non-null object
10 LastInvestmentDate 12000 non-null object
dtypes: float64(2), int64(1), object(8)
memory usage: 1.0+ MB
====================
====== Descriptive Statistics ======
CustomerID Gender Age Region Date InvestmentType \
count 12000 12000 12000.000000 12000 12000 12000
unique 2935 3 NaN 5 1624 5
top INV02563 Female NaN Kano 2/8/24 Mutual Fund
freq 12 5876 NaN 2547 25 2440
mean NaN NaN 44.494167 NaN NaN NaN
std NaN NaN 14.411880 NaN NaN NaN
min NaN NaN 20.000000 NaN NaN NaN
25% NaN NaN 32.000000 NaN NaN NaN
50% NaN NaN 45.000000 NaN NaN NaN
75% NaN NaN 57.000000 NaN NaN NaN
max NaN NaN 69.000000 NaN NaN NaN
Amount ROI CustomerJoinDate RiskProfile \
count 12000.000000 12000.000000 12000 12000
unique NaN NaN 1275 3
top NaN NaN 12/6/20 Medium
freq NaN NaN 37 5150
mean 10239.886546 868.152366 NaN NaN
std 10084.656504 3506.280029 NaN NaN
min 1000.000000 -52934.200000 NaN NaN
25% 3035.020000 77.032500 NaN NaN
50% 7161.380000 328.600000 NaN NaN
75% 14144.792500 958.127500 NaN NaN
max 127979.250000 66492.460000 NaN NaN
LastInvestmentDate
count 12000
unique 764
top 12/28/24
freq 93
mean NaN
std NaN
min NaN
25% NaN
50% NaN
75% NaN
max NaN
====================
====== Unique Entries ======
CustomerID 2935
Gender 3
Age 50
Region 5
Date 1624
InvestmentType 5
Amount 10890
ROI 11560
CustomerJoinDate 1275
RiskProfile 3
LastInvestmentDate 764
dtype: int64
====================
The output above shows 12,000 rows and 11 columns with no missing values. Three columns are numeric (Age, Amount, and ROI); the rest are stored as objects. The company has 2,935 unique customers, aged 20 to 69. Investment amounts range from ₦1,000 to ₦127,979, with a mean of ₦10,240. Return on investment is highly variable, ranging from -₦52,934 to +₦66,492, with a mean of ₦868 and a median of ₦329.
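The three-part summary above can be reproduced with a small helper along these lines. This is a minimal sketch, not the notebook's original code: the function name `summarize` and the two-row sample frame are assumptions made for illustration.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> None:
    """Print structure, descriptive statistics, and unique counts."""
    print("====== Data Information ======")
    # info() prints directly and returns None, so it is not wrapped in print()
    df.info()
    print("====== Descriptive Statistics ======")
    # include="all" reports on object columns as well as numeric ones
    print(df.describe(include="all"))
    print("====== Unique Entries ======")
    print(df.nunique())


# Hypothetical sample built from two rows of the preview table
sample = pd.DataFrame({
    "CustomerID": ["INV00088", "INV00810"],
    "Gender": ["Male", "Male"],
    "Age": [20, 39],
    "Amount": [14174.79, 28842.26],
    "ROI": [668.30, 641.62],
})

summarize(sample)
unique_counts = sample.nunique()
```

Calling `df.info()` on its own line avoids the stray `None` that appears when its return value is printed.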
From a data-cleaning perspective, the dataset is largely clean. The only issue is that the three date columns (Date, CustomerJoinDate, and LastInvestmentDate) are stored as strings rather than datetimes, which is addressed next.
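One way to convert the date columns is `pd.to_datetime` with an explicit format. The sketch below assumes the month/day/two-digit-year layout seen in the preview (for example `11/2/23`); the sample frame is a hypothetical stand-in for the full dataset.

```python
import pandas as pd

# Hypothetical sample using date strings from the preview table
sample = pd.DataFrame({
    "Date": ["11/2/23", "10/9/24"],
    "CustomerJoinDate": ["1/25/22", "11/18/23"],
    "LastInvestmentDate": ["9/27/24", "12/26/24"],
})

date_cols = ["Date", "CustomerJoinDate", "LastInvestmentDate"]
for col in date_cols:
    # An explicit format avoids ambiguous day/month guessing
    sample[col] = pd.to_datetime(sample[col], format="%m/%d/%y")

print(sample.dtypes)
```

Passing `format` explicitly makes the parse fail loudly on any row that does not match, rather than silently guessing.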
| | CustomerID | Gender | Age | Region | Date | InvestmentType | Amount | ROI | CustomerJoinDate | RiskProfile | LastInvestmentDate |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INV00088 | Male | 20 | Port Harcourt | 2023-11-02 | Mutual Fund | 14174.79 | 668.30 | 2022-01-25 | Medium | 2024-09-27 |
| 1 | INV00810 | Male | 39 | Kano | 2024-10-09 | Fixed Income | 28842.26 | 641.62 | 2023-11-18 | Medium | 2024-12-26 |
| 2 | INV00684 | Female | 41 | Abuja | 2023-07-07 | Mutual Fund | 4411.04 | 216.18 | 2022-05-14 | Medium | 2024-05-09 |
| 3 | INV01683 | Other | 23 | Port Harcourt | 2023-06-24 | Crypto | 18967.37 | 4290.20 | 2023-02-11 | Medium | 2024-06-08 |
| 4 | INV02582 | Female | 67 | Abuja | 2024-10-14 | Fixed Income | 1440.70 | 45.99 | 2021-10-04 | Medium | 2024-10-14 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 11995 | INV01870 | Female | 47 | Kano | 2023-07-09 | Mutual Fund | 1193.03 | 89.93 | 2022-06-13 | Medium | 2024-12-07 |
| 11996 | INV01632 | Male | 38 | Kano | 2022-03-05 | Fixed Income | 21248.03 | 788.98 | 2021-06-06 | Low | 2023-09-27 |
| 11997 | INV01144 | Male | 31 | Lagos | 2021-01-06 | Crypto | 6204.79 | 30.79 | 2020-09-01 | Medium | 2024-11-20 |
| 11998 | INV00882 | Female | 44 | Kano | 2024-01-19 | Fixed Income | 17048.94 | 254.48 | 2022-02-12 | Low | 2024-01-19 |
| 11999 | INV02898 | Female | 54 | Enugu | 2021-04-28 | Real Estate | 13841.56 | 990.36 | 2021-02-22 | High | 2024-12-22 |
12000 rows × 11 columns
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12000 entries, 0 to 11999
Data columns (total 11 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CustomerID 12000 non-null object
1 Gender 12000 non-null object
2 Age 12000 non-null int64
3 Region 12000 non-null object
4 Date 12000 non-null datetime64[ns]
5 InvestmentType 12000 non-null object
6 Amount 12000 non-null float64
7 ROI 12000 non-null float64
8 CustomerJoinDate 12000 non-null datetime64[ns]
9 RiskProfile 12000 non-null object
10 LastInvestmentDate 12000 non-null datetime64[ns]
dtypes: datetime64[ns](3), float64(2), int64(1), object(5)
memory usage: 1.0+ MB
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12000 entries, 0 to 11999
Data columns (total 15 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CustomerID 12000 non-null object
1 Gender 12000 non-null object
2 Age 12000 non-null int64
3 Region 12000 non-null object
4 Date 12000 non-null datetime64[ns]
5 InvestmentType 12000 non-null object
6 Amount 12000 non-null float64
7 ROI 12000 non-null float64
8 CustomerJoinDate 12000 non-null datetime64[ns]
9 RiskProfile 12000 non-null object
10 LastInvestmentDate 12000 non-null datetime64[ns]
11 Year 12000 non-null int32
12 Month 12000 non-null int32
13 CustomerTenureDays 12000 non-null int64
14 IsHighROI 12000 non-null int64
dtypes: datetime64[ns](3), float64(2), int32(2), int64(3), object(5)
memory usage: 1.3+ MB
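The four derived columns (Year, Month, CustomerTenureDays, IsHighROI) could be built roughly as follows. The source does not show how they were defined, so two definitions here are assumptions: tenure is taken as the days between CustomerJoinDate and the investment Date, and IsHighROI flags rows whose ROI is above the median.

```python
import pandas as pd

# Hypothetical sample using values from the preview table
sample = pd.DataFrame({
    "Date": pd.to_datetime(["2023-11-02", "2024-10-09", "2023-07-07"]),
    "CustomerJoinDate": pd.to_datetime(["2022-01-25", "2023-11-18", "2022-05-14"]),
    "ROI": [668.30, 641.62, 216.18],
})

# Calendar parts of the investment date
sample["Year"] = sample["Date"].dt.year
sample["Month"] = sample["Date"].dt.month

# ASSUMPTION: tenure = days from joining to the investment date
sample["CustomerTenureDays"] = (sample["Date"] - sample["CustomerJoinDate"]).dt.days

# ASSUMPTION: "high" ROI means strictly above the median ROI
sample["IsHighROI"] = (sample["ROI"] > sample["ROI"].median()).astype(int)

print(sample)
```

A different threshold (for example the 75th percentile) or a different tenure reference date would change the derived values, so the definitions should be checked against the original notebook.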
Exploratory Analysis
From early 2020 to late 2024, monthly investment totals rose steadily, indicating growing investor activity or confidence. Growth steepened around early 2021, possibly because of market recovery, policy changes, or successful financial products. Monthly totals peaked above ₦4.5 million in early to mid-2024, suggesting a period of heightened investor interest or market performance.
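A monthly trend like this one is typically computed by grouping transactions by calendar month and summing the amounts. The following is a minimal sketch under that assumption, using a hypothetical four-row sample; the 1000.00 amount is made up for illustration.

```python
import pandas as pd

# Hypothetical sample; dates are assumed to be parsed as datetimes already
sample = pd.DataFrame({
    "Date": pd.to_datetime(["2023-06-24", "2023-07-07", "2023-07-20", "2023-11-02"]),
    "Amount": [18967.37, 4411.04, 1000.00, 14174.79],
})

# Group by calendar month (a monthly Period) and total the invested amount
monthly = sample.groupby(sample["Date"].dt.to_period("M"))["Amount"].sum()

print(monthly)
```

Months with no transactions do not appear in the result, which matters if the series is later plotted as a continuous line.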
Next, we turn our attention to asset classes to understand how investors interacted with different investment instruments during the period.
InvestmentType Amount
2 Mutual Fund 10480.85
3 Real Estate 10263.28
4 Stocks 10242.47
1 Fixed Income 10113.90
0 Crypto 10097.39
Average investment amounts are nearly uniform across asset classes, ranging from about ₦10,100 to ₦10,500. Mutual Funds have the highest average at ₦10,480.85, which may indicate stronger investor confidence or popularity, while Crypto has the lowest at ₦10,097.39, possibly reflecting perceived volatility or risk aversion. The gap between the highest and lowest averages is under 4%, which points to even diversification rather than a strong preference for any single asset class.
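A ranking of average amount per asset class can be produced with a group-by, a mean, and a descending sort. This sketch assumes that approach and uses a hypothetical five-row sample taken from the preview table, so its numbers differ from the full-dataset averages.

```python
import pandas as pd

# Hypothetical sample using rows from the preview table
sample = pd.DataFrame({
    "InvestmentType": ["Mutual Fund", "Fixed Income", "Mutual Fund", "Crypto", "Fixed Income"],
    "Amount": [14174.79, 28842.26, 4411.04, 18967.37, 1440.70],
})

avg_by_type = (
    sample.groupby("InvestmentType", as_index=False)["Amount"]
    .mean()                                   # average amount per asset class
    .round(2)                                 # two decimals, as in the report
    .sort_values("Amount", ascending=False)   # highest average first
)

print(avg_by_type)
```

Using `as_index=False` keeps the original group positions as the row index after sorting, which matches the shuffled index seen in the printed result.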
The chart above visualizes the interplay between investment returns, investor age, and risk tolerance, categorized as Low, Medium, and High. Each data point on the scatter plot represents a single investment, positioned horizontally by the investor's age and vertically by the achieved Return on Investment (ROI). The color of each point further distinguishes the investor's risk profile: blue signifies Medium risk, green indicates Low risk, and red denotes High risk.
Analysis of the chart reveals several key patterns. Investors across all age groups and risk profiles have experienced a diverse range of ROI outcomes, spanning from significant gains to substantial losses. Notably, there is no discernible trend suggesting a direct correlation between investor age and investment returns across the different risk categories. Furthermore, a higher concentration of investments appears around the lower positive and negative ROI values, implying that extreme investment outcomes might be less common. High-risk investments exhibit a wider dispersion of ROI, including both higher potential gains and losses, while low-risk investments tend to cluster within a narrower range of returns, typically closer to the zero mark.
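The visual impression that high-risk investments have a wider ROI spread can be checked numerically by comparing per-profile summary statistics. The sketch below is one way to do that; the sample values are made up for illustration and are not results from the dataset.

```python
import pandas as pd

# Hypothetical ROI values chosen only to illustrate the calculation
sample = pd.DataFrame({
    "RiskProfile": ["Low", "Low", "Medium", "Medium", "High", "High"],
    "ROI": [254.48, 788.98, 668.30, 216.18, 990.36, -500.00],
})

# Mean, standard deviation, and range of ROI within each risk profile
roi_spread = sample.groupby("RiskProfile")["ROI"].agg(["mean", "std", "min", "max"])

print(roi_spread)
```

On the full dataset, a larger `std` and a wider `min` to `max` range for the High profile would support the pattern described above.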
The correlations between Amount and Age (-0.00), Amount and CustomerTenureDays (-0.01), ROI and Age (0.01), and ROI and CustomerTenureDays (0.01) are all negligible, indicating practically no linear relationship between these pairs. Amount and ROI show a weak positive correlation (0.23): larger investments tend to come with somewhat higher returns, but the relationship is not strong.
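Pairwise coefficients like these are what `DataFrame.corr` returns by default (Pearson correlation). The sketch below assumes that method was used and runs on a hypothetical five-row sample; the tenure values are illustrative, so the output will not match the full-dataset figures.

```python
import pandas as pd

# Hypothetical sample; CustomerTenureDays values are illustrative
sample = pd.DataFrame({
    "Age": [20, 39, 41, 23, 67],
    "Amount": [14174.79, 28842.26, 4411.04, 18967.37, 1440.70],
    "ROI": [668.30, 641.62, 216.18, 4290.20, 45.99],
    "CustomerTenureDays": [646, 326, 419, 133, 1106],
})

num_cols = ["Age", "Amount", "ROI", "CustomerTenureDays"]

# Pearson correlation matrix, rounded to two decimals for readability
corr = sample[num_cols].corr().round(2)

print(corr)
```

Pearson correlation captures only linear association, so a value near zero does not rule out a non-linear relationship.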
Region Amount
0 Abuja 10542.71
4 Port Harcourt 10393.56
1 Enugu 10206.16
2 Kano 10102.78
3 Lagos 9983.60
The bar chart above shows the average investment amount for five regions. Abuja has the highest average at ₦10,542.71, followed by Port Harcourt (₦10,393.56), Enugu (₦10,206.16), and Kano (₦10,102.78); Lagos has the lowest at ₦9,983.60. The averages are similar across regions, with Abuja holding only a marginal lead.
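To put a number on how similar the regional averages are, the gap between the highest and lowest region can be expressed relative to the lowest. This short sketch uses the five averages reported above; the variable names are illustrative.

```python
# Regional average investment amounts (₦), as reported above
regional_avg = {
    "Abuja": 10542.71,
    "Port Harcourt": 10393.56,
    "Enugu": 10206.16,
    "Kano": 10102.78,
    "Lagos": 9983.60,
}

highest = max(regional_avg, key=regional_avg.get)
lowest = min(regional_avg, key=regional_avg.get)

# Absolute gap and gap as a share of the lowest regional average
gap = regional_avg[highest] - regional_avg[lowest]
relative_gap = gap / regional_avg[lowest]

print(f"{highest} vs {lowest}: gap of ₦{gap:,.2f} ({relative_gap:.1%})")
```

The gap works out to about ₦559, or roughly 5.6% of the Lagos average, which is consistent with describing the regional differences as marginal.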
Insights & Recommendations
- Capitalize on the Growth Trend: Leverage the observed upward trend in monthly investments from 2020 to late 2024 by understanding and potentially reinforcing the factors that contributed to this growth, particularly the significant surge after early 2021. This might involve further investigating the market recovery, policy changes, or the success of specific financial products that drove this increase.
- Explore Mutual Fund Popularity: Further analyze why Mutual Funds have the highest average investment. Understanding the specific characteristics or investor perceptions driving this preference can inform strategies for other asset classes.
- Address Crypto Investment Aversion: Investigate the reasons for the lower average investment in Crypto despite its hype. Understanding whether this stems from perceived volatility, risk aversion, or lack of understanding can guide efforts to educate investors or potentially adjust offerings.
- Monitor Diversification Trends: Continue to observe the relatively uniform average investment across asset classes, which suggests a balanced diversification strategy. Understanding the drivers behind this could inform the development of products or advice that aligns with this preference.
- Further Investigate ROI by Risk Profile and Age: While no strong correlation was immediately apparent, the wider ROI range for high-risk investments and the narrower range for low-risk investments align with expectations. Further analysis could explore specific investment strategies within each risk profile and their performance across different age groups.
- Regional Investment Strategies: Acknowledge the slightly higher average investment in Abuja and the slightly lower average in Lagos. Further investigation into the economic factors or investor demographics in these regions could inform targeted strategies or product offerings.